AITopics | noise detection

d95cb79a3421e6d9b6c9a9008c4d07c5-Paper-Conference.pdf

Neural Information Processing Systems | Feb-18-2026, 08:25:41 GMT

data mining, large language model, machine learning, (17 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Beijing > Beijing (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Data Science > Data Quality (0.94)
(6 more...)

NoiseGPT: Label Noise Detection and Rectification through Probability Curvature

Neural Information Processing Systems | Oct-10-2025, 18:26:00 GMT

Machine learning depends on high-quality data, which is a major bottleneck in realistic deployment because collecting and labeling data takes abundant resources and massive human labor. Unfortunately, label noise, where an image is paired with an incorrect label, is ubiquitous across datasets and significantly degrades the learning performance of deep networks. Learning with Label Noise (LNL) has been a common strategy for mitigating the influence of noisy labels.

dataset, experiment, noisegpt, (12 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Beijing > Beijing (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(6 more...)

Exploring the Frontiers of kNN Noisy Feature Detection and Recovery for Self-Driving Labs

Shi, Qiuyu, Li, Kangming, Fehlis, Yao, Persaud, Daniel, Black, Robert, Hattrick-Simpers, Jason

arXiv.org Artificial Intelligence | Jul-24-2025

Self-driving laboratories (SDLs) have shown promise to accelerate materials discovery by integrating machine learning with automated experimental platforms. However, errors in the capture of input parameters may corrupt the features used to model system performance, compromising current and future campaigns. This study develops an automated workflow to systematically detect noisy features, determine sample-feature pairings that can be corrected, and finally recover the correct feature values. A systematic study is then performed to examine how dataset size, noise intensity, and feature value distribution affect both the detectability and recoverability of noisy features. In general, high-intensity noise and large training datasets are conducive to the detection and correction of noisy features. Low-intensity noise reduces detection and recovery, but this can be compensated for by larger clean training datasets. Detection and correction results vary between features: those with continuous and dispersed distributions show greater recoverability than those with discrete or narrow distributions. This systematic study not only demonstrates a model-agnostic framework for rational data recovery in the presence of noise, limited data, and differing feature distributions but also provides a tangible benchmark of kNN imputation in materials datasets. Ultimately, it aims to enhance data quality and experimental precision in automated materials discovery.
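As a rough illustration of the detect-then-recover idea in this abstract, the sketch below estimates a suspect feature from its k nearest clean neighbours and flags the recorded value when it deviates too far from that estimate. The function names (`knn_recover`, `is_noisy`), the tolerance `tol`, and the toy data are assumptions made for this example; they are not taken from the paper, whose workflow may differ.

```python
import math


def knn_recover(clean_rows, row, suspect, k=3):
    """Estimate row[suspect] as the mean of that feature over the k clean
    rows that are closest to `row` on all the other features."""
    def dist(a, b):
        # Euclidean distance that ignores the suspect feature.
        return math.sqrt(
            sum((a[i] - b[i]) ** 2 for i in range(len(a)) if i != suspect)
        )

    neighbours = sorted(clean_rows, key=lambda r: dist(r, row))[:k]
    return sum(r[suspect] for r in neighbours) / len(neighbours)


def is_noisy(clean_rows, row, suspect, tol, k=3):
    """Flag the feature when the recorded value is more than `tol` away
    from the kNN estimate (hypothetical decision rule)."""
    return abs(row[suspect] - knn_recover(clean_rows, row, suspect, k)) > tol


# Toy clean data in which the third feature is the sum of the first two.
clean = [
    [1.0, 1.0, 2.0],
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 4.0],
    [3.0, 3.0, 6.0],
    [5.0, 5.0, 10.0],
]
corrupted = [1.0, 1.0, 9.0]  # third feature was mis-recorded
estimate = knn_recover(clean, corrupted, suspect=2, k=3)
flagged = is_noisy(clean, corrupted, suspect=2, tol=1.0, k=3)
```

In this toy setting the estimate comes from the three closest clean rows, so the quality of the recovery depends on how densely the clean data covers the neighbourhood of the corrupted sample, which is consistent with the abstract's remark that larger clean training sets help.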

artificial intelligence, machine learning, noisy feature, (15 more...)

arXiv: 2507.16833

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States (0.14)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.63)

Investigating the Generalizability of ECG Noise Detection Across Diverse Data Sources and Noise Types

Kalpande, Sharmad, Sahu, Nilesh Kumar, Lone, Haroon

arXiv.org Artificial Intelligence | Feb-20-2025

Electrocardiograms (ECGs) are essential for monitoring cardiac health, allowing clinicians to analyze heart rate variability (HRV), detect abnormal rhythms, and diagnose cardiovascular diseases. However, ECG signals, especially those from wearable devices, are often affected by noise artifacts caused by motion, muscle activity, or device-related interference. These artifacts distort R-peaks and the characteristic QRS complex, making HRV analysis unreliable and increasing the risk of misdiagnosis. Despite this, the few existing studies on ECG noise detection have primarily focused on a single dataset, limiting the understanding of how well noise detection models generalize across different datasets. In this paper, we investigate the generalizability of noise detection in ECG using a novel HRV-based approach through cross-dataset experiments on four datasets. Our results show that machine learning achieves an average accuracy of over 90% and an AUPRC of more than 0.9. These findings suggest that regardless of the ECG data source or the type of noise, the proposed method maintains high accuracy even on unseen datasets, demonstrating the feasibility of generalizability.
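To make the link between distorted R-peaks and HRV concrete, the sketch below computes two standard time-domain HRV statistics, SDNN and RMSSD, from R-peak timestamps. The abstract does not say which HRV features the authors use, so the choice of these two, the function names, and the toy timestamps are assumptions for illustration only.

```python
import math


def rr_intervals(r_peaks_ms):
    """Successive differences between R-peak timestamps (milliseconds)."""
    return [b - a for a, b in zip(r_peaks_ms, r_peaks_ms[1:])]


def sdnn(rr):
    """Sample standard deviation of the RR intervals."""
    mean = sum(rr) / len(rr)
    return math.sqrt(sum((x - mean) ** 2 for x in rr) / (len(rr) - 1))


def rmssd(rr):
    """Root mean square of successive RR-interval differences."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Toy example: a perfectly regular rhythm, and the same rhythm with one
# R-peak detected 500 ms too early (as a motion artifact might cause).
clean_peaks = [0, 800, 1600, 2400, 3200]
noisy_peaks = [0, 800, 1100, 2400, 3200]

clean_rr = rr_intervals(clean_peaks)
noisy_rr = rr_intervals(noisy_peaks)
clean_features = (sdnn(clean_rr), rmssd(clean_rr))
noisy_features = (sdnn(noisy_rr), rmssd(noisy_rr))
```

A single misplaced peak moves both statistics far from their clean values, which is why features of this kind could plausibly serve as inputs to a noise classifier; the classifier itself is outside the scope of this sketch.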

artifact, dataset, ecg signal, (14 more...)

arXiv: 2502.14522

Country:

Asia > India > Madhya Pradesh > Bhopal (0.05)
Europe > Czechia > South Moravian Region > Brno (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Luo, Junyu, Luo, Xiao, Ding, Kaize, Yuan, Jingyang, Xiao, Zhiping, Zhang, Ming

arXiv.org Artificial Intelligence | Dec-19-2024

Supervised fine-tuning (SFT) plays a crucial role in adapting large language models (LLMs) to specific domains or tasks. However, as demonstrated by empirical experiments, the collected data inevitably contains noise in practical applications, which poses significant challenges to model performance on downstream tasks. Therefore, there is an urgent need for a noise-robust SFT framework to enhance model capabilities in downstream tasks. To address this challenge, we introduce a robust SFT framework (RobustFT) that performs noise detection and relabeling on downstream task data. For noise identification, our approach employs a multi-expert collaborative system with inference-enhanced models to achieve superior noise detection. In the denoising phase, we utilize a context-enhanced strategy, which incorporates the most relevant and confident knowledge followed by careful assessment to generate reliable annotations. Additionally, we introduce an effective data selection mechanism based on response entropy, ensuring only high-quality samples are retained for fine-tuning. Extensive experiments conducted on multiple LLMs across five datasets demonstrate RobustFT's exceptional performance in noisy scenarios.
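The entropy-based data selection mentioned in this abstract can be pictured with a small sketch: score each response by the mean Shannon entropy of its per-token probability distributions and keep only the low-entropy (more confident) ones. How RobustFT actually computes response entropy is not stated here, so the averaging scheme, the `max_entropy` cutoff, and all names below are assumptions for illustration.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one token's probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def response_entropy(token_dists):
    """Mean per-token entropy over a whole response."""
    return sum(token_entropy(d) for d in token_dists) / len(token_dists)


def select_low_entropy(samples, max_entropy):
    """Keep samples whose response entropy is at or below the cutoff."""
    return [
        s for s in samples
        if response_entropy(s["token_dists"]) <= max_entropy
    ]


# Toy samples: each response has two tokens over a three-word vocabulary.
samples = [
    {"id": "confident", "token_dists": [[0.97, 0.02, 0.01], [0.90, 0.05, 0.05]]},
    {"id": "uncertain", "token_dists": [[0.34, 0.33, 0.33], [0.40, 0.30, 0.30]]},
]
kept = select_low_entropy(samples, max_entropy=0.5)
kept_ids = [s["id"] for s in kept]
```

The cutoff of 0.5 nats is arbitrary; in practice such a threshold would need to be tuned per model and dataset.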

arxiv preprint arxiv, large language model, natural language, (15 more...)

arXiv: 2412.14922

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Improving the Robustness of Summarization Models by Detecting and Removing Input Noise

Krishna, Kundan, Zhao, Yao, Ren, Jie, Lakshminarayanan, Balaji, Luo, Jiaming, Saleh, Mohammad, Liu, Peter J.

arXiv.org Artificial Intelligence | Dec-4-2023

The evaluation of abstractive summarization models typically uses test data drawn from the same distribution as the training data. In real-world practice, documents to be summarized may contain input noise caused by text extraction artifacts or data pipeline bugs. The robustness of model performance under distribution shift caused by such noise is relatively under-studied. We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) from different types of input noise for a range of datasets and model sizes. We then propose a light-weight method for detecting and removing such noise in the input during model inference without requiring any extra training, auxiliary models, or even prior knowledge of the type of noise. Our proposed approach effectively mitigates the loss in performance, recovering a large fraction of the performance drop, sometimes as large as 11 ROUGE-1 points.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv: 2212.09928

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Enhancing ECG Analysis of Implantable Cardiac Monitor Data: An Efficient Pipeline for Multi-Label Classification

Bleich, Amnon, Linnemann, Antje, Jaidi, Benjamin, Diem, Björn H, Conrad, Tim OF

arXiv.org Artificial Intelligence | Jul-12-2023

Implantable Cardiac Monitor (ICM) devices are currently the fastest-growing market for implantable cardiac devices. As such, they are becoming increasingly common in patients for measuring the heart's electrical activity. ICMs constantly monitor and record a patient's heart rhythm and, when triggered, send it to a secure server where health care professionals (HCPs) can review it. These devices employ a relatively simple rule-based algorithm (due to energy consumption constraints) to alert for abnormal heart rhythms. This algorithm is usually parameterized to an over-sensitive mode so as not to miss a case (resulting in a relatively high false-positive rate). Combined with the devices' constant monitoring of the heart rhythm and their growing popularity, this leaves HCPs with an ever-growing amount of data to analyze and diagnose. To reduce this load, automated methods for ECG analysis are becoming a valuable tool to assist HCPs in their analysis. While state-of-the-art algorithms are data-driven rather than rule-based, training data for ICMs often have specific characteristics that make their analysis unique and particularly challenging. This study presents the challenges and solutions in automatically analyzing ICM data and introduces a method for its classification that outperforms existing methods on such data. As such, it could be used in numerous ways, such as aiding HCPs in the analysis of ECGs originating from ICMs by, e.g., suggesting a rhythm type.

algorithm, artificial intelligence, machine learning, (20 more...)

arXiv: 2307.07423

Country:

Europe > Portugal > Guarda > Guarda (0.04)
Europe > Montenegro (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.93)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning Cautiously in Federated Learning with Noisy and Heterogeneous Clients

Wu, Chenrui, Li, Zexi, Wang, Fangxin, Wu, Chao

arXiv.org Artificial Intelligence | Apr-6-2023

Federated learning (FL) is a distributed framework for collaboratively training with privacy guarantees. In real-world scenarios, clients may have Non-IID data (local class imbalance) with poor annotation quality (label noise). The co-existence of label noise and class imbalance in FL's small local datasets renders conventional FL methods and noisy-label learning methods both ineffective. To address these challenges, we propose FedCNI, which does not require an additional clean proxy dataset. It includes a noise-resilient local solver and a robust global aggregator. For the local solver, we design a more robust prototypical noise detector to distinguish noisy samples. Further, to reduce the negative impact of the noisy samples, we devise a curriculum pseudo labeling method and a denoise Mixup training strategy. For the global aggregator, we propose a switching re-weighted aggregation method tailored to different learning periods. Extensive experiments demonstrate that our method can substantially outperform state-of-the-art solutions in mix-heterogeneous FL environments.

artificial intelligence, machine learning, noisy sample, (12 more...)

arXiv: 2304.02892

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > China > Hong Kong (0.04)
Asia > China > Zhejiang Province (0.04)
Asia > China > Anhui Province (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Label noise detection under the Noise at Random model with ensemble filters

Moura, Kecia G., Prudêncio, Ricardo B. C., Cavalcanti, George D. C.

arXiv.org Artificial Intelligence | Dec-2-2021

Label noise detection has been widely studied in Machine Learning because of its importance in improving training data quality. Satisfactory noise detection has been achieved by adopting ensembles of classifiers. In this approach, an instance is flagged as mislabeled if a high proportion of members in the pool misclassify it. Previous authors have empirically evaluated this approach; nevertheless, they mostly assumed that label noise is generated completely at random in a dataset. This is a strong assumption since other types of label noise are feasible in practice and can influence noise detection results. This work investigates the performance of ensemble noise detection under two different noise models: Noisy at Random (NAR), in which the probability of label noise depends on the instance class, and Noisy Completely at Random, in which the probability of label noise is entirely independent of the instance. In this setting, we investigate the effect of class distribution on noise detection performance since it changes the total noise level observed in a dataset under the NAR assumption. Further, an evaluation of the ensemble vote threshold is conducted to contrast with the most common approaches in the literature. In many of the experiments performed, choosing one noise generation model over another can lead to different results when considering aspects such as class imbalance and noise level ratio among different classes.
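The ensemble filtering rule described in this abstract (flag an instance when a high proportion of pool members misclassify it) can be sketched in a few lines. The function name `flag_mislabeled`, the `>=` comparison, and the toy predictions are assumptions for illustration; the paper's exact voting scheme and thresholds may differ.

```python
def flag_mislabeled(labels, member_predictions, threshold=0.5):
    """Return indices of instances whose given label disagrees with at
    least `threshold` (a fraction) of the ensemble members.

    labels: list of given (possibly noisy) labels, one per instance.
    member_predictions: one list of predicted labels per ensemble member.
    threshold=1.0 corresponds to a consensus filter (all members disagree).
    """
    flagged = []
    n_members = len(member_predictions)
    for i, label in enumerate(labels):
        wrong = sum(1 for preds in member_predictions if preds[i] != label)
        if wrong / n_members >= threshold:
            flagged.append(i)
    return flagged


# Toy example: four instances, three ensemble members.
labels = [0, 1, 1, 0]
member_predictions = [
    [0, 1, 0, 0],  # member 1
    [0, 1, 0, 1],  # member 2
    [0, 0, 0, 0],  # member 3
]
majority_flags = flag_mislabeled(labels, member_predictions, threshold=0.5)
consensus_flags = flag_mislabeled(labels, member_predictions, threshold=1.0)
lenient_flags = flag_mislabeled(labels, member_predictions, threshold=0.3)
```

Lowering the threshold flags more instances (including ones only a single member disputes), which illustrates why the vote threshold is worth evaluating separately from the choice of noise model.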

data quality, machine learning, noise detection, (18 more...)

doi: 10.3233/IDA-215980

arXiv: 2112.01617

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
South America > Brazil > Pernambuco > Recife (0.04)
North America > United States > Wisconsin (0.04)
Asia > British Indian Ocean Territory > Diego Garcia (0.04)

Genre:

Research Report > New Finding (0.94)
Research Report > Experimental Study (0.70)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Deep Convolutional Neural Networks for Noise Detection in ECGs

John, Jennifer N., Galloway, Conner, Valys, Alexander

arXiv.org Machine Learning | Oct-4-2018

Mobile electrocardiogram (ECG) recording technologies represent a promising tool to fight the ongoing epidemic of cardiovascular diseases, which are responsible for more deaths globally than any other cause. While the ability to monitor one's heart activity at any time in any place is a crucial advantage of such technologies, it is also the cause of a drawback: signal noise due to environmental factors can render the ECGs illegible. In this work, we develop convolutional neural networks (CNNs) to automatically label ECGs for noise, training them on a novel noise-annotated dataset. By reducing distraction from noisy intervals of signals, such networks have the potential to increase the accuracy of models for the detection of atrial fibrillation, long QT syndrome, and other cardiovascular conditions. Comparing several architectures, we find that a 16-layer CNN adapted from the VGG16 network, which generates one prediction per second on a 10-second input, performs exceptionally well on this task, with an AUC of 0.977.

artificial intelligence, machine learning, noise, (19 more...)

arXiv: 1810.04122

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
